Efficient implementations of the Quantum Fourier Transform: an experimental perspective
The Quantum Fourier transform (QFT) is a key ingredient in most quantum algorithms. We have
compared various spin-based quantum computing schemes to implement the QFT from the point
of view of their actual time-costs and the accuracy of the implementation. We focus here on an
interesting decomposition of the QFT as a product of the non-selective Hadamard transformation
followed by multiqubit gates corresponding to square- and higher-roots of controlled-NOT gates.
This decomposition requires only $O(n)$ operations and is thus linear in the number of qubits $n$.
The schemes were implemented on a two-qubit NMR quantum information processor and the
resultant density matrices reconstructed using standard quantum state tomography techniques.
Their experimental fidelities have been measured and compared.

PACS numbers: 03.67.Lx,76.70.-k 



I. INTRODUCTION 

The Fourier transform finds widespread application in
physics and information processing, and it comes as no
surprise that its quantum version lies at the core of most
known quantum computational algorithms. The Quantum
Fourier Transform (QFT) is analogous to the classical
Fast Fourier Transform (FFT), and by exploiting the
advantages of quantum parallelism, can be computed
exponentially faster. However, this advantage cannot be
used to speed up data processing tasks, since the
individual Fourier transformed output amplitudes cannot be
accessed by a measurement. What the QFT can achieve
is an estimation of arbitrary quantum phases and an
extraction of the periodicity of a function. Indeed, fast
quantum algorithms for factoring [1, 2, 3], finding
discrete logarithms [4] and the more general algorithm for
finding the stabilizer of an Abelian group [5], rely
crucially on this property of the QFT.
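The periodicity extraction mentioned above can be illustrated with a minimal classical sketch. It is not part of the experiments reported here: it simply applies a discrete Fourier transform with the usual QFT normalisation to amplitudes that are periodic in the computational basis. The register size `n` and the period `r` are illustrative assumptions, with `r` chosen to divide `q`.

```python
import cmath
import math

def qft_amplitudes(amps):
    """C_c = q^(-1/2) * sum_a A_a * exp(2*pi*i*a*c/q) for c = 0..q-1."""
    q = len(amps)
    return [sum(A * cmath.exp(2j * math.pi * a * c / q) for a, A in enumerate(amps))
            / math.sqrt(q) for c in range(q)]

n = 4            # illustrative number of qubits
q = 2 ** n       # dimension of the Hilbert space
r = 4            # illustrative period (assumed to divide q)

# Equal superposition of the basis states a = 0, r, 2r, ...
support = [a for a in range(q) if a % r == 0]
amps = [1 / math.sqrt(len(support)) if a in support else 0.0 for a in range(q)]

# Measurement probabilities after the transform
probs = [abs(C) ** 2 for C in qft_amplitudes(amps)]
peaks = [c for c, p in enumerate(probs) if p > 1e-9]
print(peaks)     # indices carrying non-negligible probability
```

In this sketch the non-zero probabilities sit on multiples of q/r, so a measurement reveals the period even though the individual amplitudes are not accessible.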

Schemes to implement the QFT have been proposed
using cavity QED and have been experimentally
implemented using NMR [?]. Despite several limitations
(see for example the points made in [11, 12] and
similar reviews), NMR remains to date the only existing
quantum-computing technology. However, with ideas
for quantum algorithms that work with expectation-value
quantum computers [?] and proposals for scalable
solid-state NMR quantum computing implementations
[15, 16, 17], it is likely that spin-based implementations
will soon emerge as a viable technology for quantum
computing.

The key role played by the QFT in quantum algorithms
makes it an attractive candidate for detailed investigations
of its experimental implementations. Issues of the
actual time-cost of quantum algorithms as compared to
their ideal computational cost have seldom been
quantitatively addressed [18, 19]. However, these issues are
relevant and need to be tackled for technology to keep
pace with theoretical developments. This paper seeks
to compare different decompositions of the QFT, with a
view to finding the most efficient experimental spin-based
quantum computing implementation.



II. THE QFT AND ITS DECOMPOSITIONS 

The basis states that we consider are product states
$|a\rangle = |a_{n-1} a_{n-2} \ldots a_0\rangle = |a_{n-1}\rangle_{n-1} \otimes \cdots \otimes |a_1\rangle_1 \otimes |a_0\rangle_0$,
which can be represented by binary integers

$$a = \sum_{l=0}^{n-1} a_l 2^l .$$

$q = 2^n$ is the dimension of the Hilbert space and $n$ the
number of qubits.

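In this basis, the QFT can be represented as a unitary
operator $T$, which transforms the basis states $|a\rangle$ into

$$T|a\rangle = \frac{1}{\sqrt{q}} \sum_{c=0}^{q-1} e^{2\pi i a c/q} |c\rangle . \qquad (1)$$

The states $|c\rangle$ have the same form as $|a\rangle$.

When applied to an arbitrary state $|\psi\rangle = \sum_a A_a |a\rangle$,
the QFT yields

$$|\psi\rangle = \sum_{a=0}^{q-1} A_a |a\rangle \;\longrightarrow\; \sum_{c=0}^{q-1} C_c |c\rangle \qquad (2)$$

where the coefficients $C_c$ are the discrete Fourier
transform of the input coefficients $A_a$.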

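The basis transformation of Eqn. (1) can be written in
terms of individual qubits as

$$T|a\rangle = |\rho(\phi_{n-1})\rangle_{n-1} \otimes \cdots \otimes |\rho(\phi_1)\rangle_1 \otimes |\rho(\phi_0)\rangle_0 \qquad (3)$$

where each qubit is in a state $|\rho(\phi_j)\rangle = (|0\rangle + e^{i 2\pi \phi_j} |1\rangle)/\sqrt{2}$.
The phases are determined by

$$\phi_j = \sum_{k=0}^{n-1-j} a_k 2^{j+k-n} .$$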
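The qubit-wise form of the transform can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the amplitudes $e^{2\pi i a c/q}/\sqrt{q}$ of Eqn. (1) and the phases $\phi_j = \sum_{k=0}^{n-1-j} a_k 2^{j+k-n}$, and assumes that the qubit labelled $j$ carries bit $c_j$ of the output index. The function names are hypothetical.

```python
import cmath
import math

def qft_column(a, n):
    """Amplitudes of T|a>: exp(2*pi*i*a*c/q)/sqrt(q) for c = 0..q-1."""
    q = 2 ** n
    return [cmath.exp(2j * math.pi * a * c / q) / math.sqrt(q) for c in range(q)]

def phase(a, j, n):
    """phi_j = sum_{k=0}^{n-1-j} a_k * 2^(j+k-n), a_k being bit k of a."""
    return sum(((a >> k) & 1) * 2.0 ** (j + k - n) for k in range(n - j))

def product_state(a, n):
    """Tensor product of (|0> + exp(2*pi*i*phi_j)|1>)/sqrt(2) over all qubits j."""
    amps = []
    for c in range(2 ** n):
        amp = 1.0 + 0j
        for j in range(n):
            if (c >> j) & 1:                      # qubit j is in |1>
                amp *= cmath.exp(2j * math.pi * phase(a, j, n))
            amp /= math.sqrt(2)
        amps.append(amp)
    return amps

n = 3  # illustrative register size
max_err = max(abs(x - y)
              for a in range(2 ** n)
              for x, y in zip(qft_column(a, n), product_state(a, n)))
print(max_err < 1e-12)
```

The two constructions agree because the terms of $a\,2^{j-n}$ with $j+k-n \ge 0$ are integers and drop out of the phase.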



Equation (3) serves as the basis for implementing the
QFT by one- and two-qubit operations. One implementation
[?] uses single-qubit Hadamard gates $H_j$ and two-qubit
controlled-phase gates $B_{j,k}$ that
act on the qubits $j$ and $k$ and are given by

$$B_{j,k} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & e^{i\theta_{jk}} \end{pmatrix}$$

where $\theta_{jk} = \pi 2^{j-k}$ is a conditional phase shift applied
only if both qubits are in the state $|1\rangle$. In terms of these
gates, the quantum circuit for $n$ qubits is

$$\mathrm{QFT}_n = (H_1 B_{1,2} \cdots B_{1,n})(H_2 B_{2,3} \cdots B_{2,n}) \cdots (H_{n-1} B_{n-1,n})(H_n) \qquad (4)$$



with the sequence of operations being performed from
right to left. With this implementation, the bit values of
the result appear in reversed order. If a sequence reversal
is required, this can be achieved by a sequence of SWAP
operations on pairs of qubits.

We shall denote this decomposition of the QFT as
"serial". For $n$ qubits, it requires a total of $n$ $H_j$ gates,
$n(n-1)/2$ $B_{j,k}$ gates and $n/2$ SWAP operations, leading
to a computational complexity of $O(n^2)$.
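The serial decomposition can be simulated on a state vector as a consistency check. The following is a minimal sketch, not the experimental implementation: it assumes the gate order of Eqn. (4) read from right to left, conditional phases $\theta_{jk} = \pi 2^{j-k}$, and a mapping of qubit labels 1..n onto bits 0..n-1 of the basis-state index. The final bit reversal is handled by relabelling rather than by SWAP gates.

```python
import cmath
import math

def apply_h(state, p):
    """Hadamard on qubit p (bit p of the basis-state index)."""
    out = [0j] * len(state)
    s = 1 / math.sqrt(2)
    for i, amp in enumerate(state):
        sign = -1 if (i >> p) & 1 else 1
        out[i & ~(1 << p)] += s * amp          # component with qubit p = 0
        out[i | (1 << p)] += s * sign * amp    # component with qubit p = 1
    return out

def apply_b(state, j, k, theta):
    """Controlled-phase gate: phase exp(i*theta) when qubits j and k are both 1."""
    return [amp * cmath.exp(1j * theta) if (i >> j) & 1 and (i >> k) & 1 else amp
            for i, amp in enumerate(state)]

def serial_qft(state, n):
    """Apply the groups (H_j B_{j,j+1} ... B_{j,n}) from right to left; count gates."""
    gates = 0
    for j in range(n, 0, -1):
        for k in range(n, j, -1):
            state = apply_b(state, j - 1, k - 1, math.pi * 2.0 ** (j - k))
            gates += 1
        state = apply_h(state, j - 1)
        gates += 1
    return state, gates

def reverse_bits(i, n):
    """Read the n-bit index i in reversed bit order."""
    return int(format(i, "0{}b".format(n))[::-1], 2)

n = 3
q = 2 ** n
max_err = 0.0
for a in range(q):
    basis = [1.0 + 0j if i == a else 0j for i in range(q)]
    out, gates = serial_qft(basis, n)
    for i in range(q):
        ideal = cmath.exp(2j * math.pi * a * reverse_bits(i, n) / q) / math.sqrt(q)
        max_err = max(max_err, abs(out[i] - ideal))
print(max_err < 1e-12, gates)
```

The gate counter returns $n + n(n-1)/2$, the number of Hadamard and controlled-phase gates quoted above (SWAP operations excluded).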

Individual Hadamard operations are qubit-selective
and hence costlier than a total Hadamard operator that
is applied on all qubits simultaneously. It would thus be
desirable to have a decomposition of the QFT that
involves a non-selective Hadamard transformation [?].

A more useful (for NMR) decomposition of the QFT
can be obtained by noting that the Hadamard operator
is self-inverse and that a Hadamard rotation of the
controlled-phase gate can be decomposed as a root of a
controlled-NOT gate,

$$H_k B_{j,k} H_k = [U_{\mathrm{CNOT}}]_{j,k}^{\theta_{jk}/\pi}$$

up to a global phase factor, where $j$ is the control and $k$
the target qubit. The global phase factor does not influence
measurement results and is henceforth ignored. Further, using

$$[H_i, B_{j,k}] = 0, \quad i \neq j, k,$$

the sequence of operations in Eqn. (4) can be modified
to

$$\mathrm{QFT}_n = [H_T][U_1 U_2 \cdots U_{n-2} U_{n-1}] \qquad (5)$$

where $H_T$ is the total non-selective Hadamard operator on
all qubits, i.e. a single $\pi/2$ radio-frequency pulse.

FIG. 1: (a) Quantum circuit for the serial implementation of
the QFT using qubit-selective Hadamard rotations and
two-qubit controlled-phase gates. (b) Circuit for the parallel
implementation of the QFT using a non-selective Hadamard
rotation on all qubits, and multiqubit gates. The readout of the
QFT is performed in reverse order on the qubits, achieved by
SWAP operations (not shown in the circuits).
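The statement that a Hadamard-conjugated controlled-phase gate is a root of the controlled-NOT gate can be verified with small matrices. The sketch below is illustrative only. It assumes the basis ordering $|{\rm control}, {\rm target}\rangle$ and the matrix form $B = \mathrm{diag}(1, 1, 1, e^{i\theta})$; with this convention the square (or fourth power) reproduces the controlled-NOT gate without an additional phase. The helper names are hypothetical.

```python
import cmath
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]
H_TARGET = kron(I2, H)   # Hadamard acting on the target (second) qubit only
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def b_gate(theta):
    """Controlled-phase gate diag(1, 1, 1, exp(i*theta))."""
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, cmath.exp(1j * theta)]]

def cnot_root(theta):
    """H_k B_{j,k} H_k for a conditional phase theta."""
    return matmul(H_TARGET, matmul(b_gate(theta), H_TARGET))

def power(U, m):
    """m-th power of a square matrix by repeated multiplication."""
    out = [[1 if i == j else 0 for j in range(len(U))] for i in range(len(U))]
    for _ in range(m):
        out = matmul(out, U)
    return out

def max_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

sqrt_err = max_diff(power(cnot_root(math.pi / 2), 2), CNOT)    # theta = pi/2: square root
fourth_err = max_diff(power(cnot_root(math.pi / 4), 4), CNOT)  # theta = pi/4: fourth root
print(sqrt_err < 1e-12, fourth_err < 1e-12)
```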
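The $U$ gates in Eqn. (5) are

$$\begin{aligned}
U_1 &= H_2 B_{1,2} H_2 = (U_{\mathrm{CNOT}})^{1/2}_{1,2} \\
U_{n-2} &= H_{n-1} (B_{1,n-1} B_{2,n-1} \cdots B_{n-2,n-1}) H_{n-1} \\
        &= (U_{\mathrm{CNOT}})^{1/2^{n-2}}_{1,n-1} (U_{\mathrm{CNOT}})^{1/2^{n-3}}_{2,n-1} \cdots (U_{\mathrm{CNOT}})^{1/2}_{n-2,n-1} \\
U_{n-1} &= (U_{\mathrm{CNOT}})^{1/2^{n-1}}_{1,n} (U_{\mathrm{CNOT}})^{1/2^{n-2}}_{2,n} \cdots (U_{\mathrm{CNOT}})^{1/2}_{n-1,n}
\end{aligned} \qquad (6)$$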



They correspond to single spin rotations conditioned on
the status of all the other spins involved in the operation.
Since they are single spin rotations, they can be
implemented as single radio-frequency pulses. The
conditioning on the state of the other spins is achieved by
making them selective on specific transitions. As an
example, for a system of three qubits, the operation $U_{n-1}$
in this case is given by a fourth-root of a controlled-NOT
gate on qubits one and three, followed by a square-root of
a controlled-NOT gate on qubits two and three, with the
third qubit being the target qubit in both cases. These
gates thus involve three transitions of the third qubit:
$|100\rangle \to |101\rangle$, $|110\rangle \to |111\rangle$, and $|010\rangle \to |011\rangle$. Since these
are unconnected transitions, rotations in the subspace of
these transitions can be achieved simultaneously. Many
pulsed irradiation schemes for such precise selective
excitation exist in NMR, mostly involving shaping the
excitation profile of the rf waveform [22]. The entire QFT
operation in this decomposition therefore reduces to a
sequence of $n$ radio-frequency pulses. It scales linearly
with the number of qubits, and we will denote it as the
"parallel" implementation.



III. TIME-COST OF THE QFT 

The main issue in the experimental implementation of
quantum algorithms is not the number of logical operations
per se, but the actual time-cost of each logical
operation/quantum gate. The $U$ transformations in Eqn. (5)
are no longer two-qubit phase-shift gates but correspond
to square- and higher-roots of controlled-NOT operations
on specific qubits. They can be implemented experimentally
using multiqubit gates that perform manipulations
on qubits simultaneously. Since the NMR Hamiltonian
has terms connecting multiple pairs of qubits, such
multiqubit gates can be directly implemented. The quantum
circuits for both serial and parallel implementations of
the QFT are shown in Figure 1.
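That the serial and parallel circuits describe the same unitary can be checked on a state vector. This is a sketch under stated assumptions, not the pulse-level implementation: each multiqubit gate is taken as $U_m = H_{m+1}(B_{1,m+1} \cdots B_{m,m+1})H_{m+1}$, the parallel circuit as $H_T$ applied after $U_1 U_2 \cdots U_{n-1}$ (operations read right to left), and $\theta_{jk} = \pi 2^{j-k}$. Qubit labels 1..n are mapped onto bits 0..n-1.

```python
import cmath
import math

def apply_h(state, p):
    """Hadamard on qubit p (bit p of the basis-state index)."""
    out = [0j] * len(state)
    s = 1 / math.sqrt(2)
    for i, amp in enumerate(state):
        sign = -1 if (i >> p) & 1 else 1
        out[i & ~(1 << p)] += s * amp
        out[i | (1 << p)] += s * sign * amp
    return out

def apply_b(state, j, k, theta):
    """Controlled-phase gate on qubits j and k."""
    return [amp * cmath.exp(1j * theta) if (i >> j) & 1 and (i >> k) & 1 else amp
            for i, amp in enumerate(state)]

def serial(state, n):
    """Serial circuit: groups (H_j B_{j,j+1} ... B_{j,n}), rightmost group first."""
    for j in range(n, 0, -1):
        for k in range(n, j, -1):
            state = apply_b(state, j - 1, k - 1, math.pi * 2.0 ** (j - k))
        state = apply_h(state, j - 1)
    return state

def parallel(state, n):
    """Parallel circuit: U_{n-1} acts first, U_1 last, then the total Hadamard H_T."""
    for m in range(n - 1, 0, -1):
        t = m + 1                                   # target qubit of U_m
        state = apply_h(state, t - 1)
        for j in range(1, t):
            state = apply_b(state, j - 1, t - 1, math.pi * 2.0 ** (j - t))
        state = apply_h(state, t - 1)
    for p in range(n):                              # H_T on every qubit
        state = apply_h(state, p)
    return state

def max_difference(n):
    """Largest amplitude difference between the two circuits over all basis inputs."""
    q = 2 ** n
    worst = 0.0
    for a in range(q):
        basis = [1.0 + 0j if i == a else 0j for i in range(q)]
        worst = max(worst, max(abs(x - y)
                               for x, y in zip(serial(basis, n), parallel(basis, n))))
    return worst

print([max_difference(n) < 1e-12 for n in (2, 3, 4)])
```

The agreement follows from the Hadamard operator being self-inverse and commuting with controlled-phase gates on other qubits, which is the argument used to derive the parallel form.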

The most expensive operation in the serial implementation
of the QFT is the controlled-phase shift gate $B_{jk}$.
The ideal time-cost is computed assuming all gates take
the same amount of time. However, experimentally, the
controlled-phase shift gate requires a time $\tau_{jk}$
proportional to the desired phase rotation angle $\theta_{jk}$ (related
to the "distance" $(k - j)$ of the qubits), and inversely
proportional to the interaction $J_{jk}$ between the qubits.
The magnitude of the interaction $J_{jk}$ and hence the time
cost depends on the specific experimental quantum computing
technology under consideration. For liquid-state
NMR, $J_{jk}$ is the electron-mediated scalar coupling
between the qubits. For our calculations, we assume the
$J_{jk}$'s to be of the same order of magnitude for all qubits,
represented by a universal constant coupling $J$. The
actual time cost of the serial decomposition of the QFT,
involving only one-qubit Hadamards and the two-qubit
phase-controlled gate is

$$\begin{aligned}
T_{ser} &= n\delta + \sum_{j=0}^{n-1} \sum_{k=j+1}^{n} \tau_{jk} = n\delta + \kappa \sum_{j=0}^{n-1} \sum_{k=j+1}^{n} 2^{j-k} \\
        &= n\delta + \kappa (n - 1 + 2^{-n}) \\
        &\approx O(n)
\end{aligned} \qquad (7)$$

where $\delta$ is the time-cost of each single-qubit Hadamard
rotation and $\kappa = \pi/J$. Using multiqubit gates in the
parallel implementation of the QFT reduces the actual
time-cost of the algorithm. Quite apart from the saving
obtained by using a non-selective Hadamard transformation
in the beginning on all the qubits, each $U$ gate can
be thought of as having components from one or more
$B_{jk}$ gates. The actual time-cost of the parallel QFT is

$$T_{par} = \sum_{j=0}^{n-1} \sum_{k=j+1}^{n} \tau_{jk} \qquad (8)$$

Since for multiqubit gates, the system evolves under more
than one coupling period simultaneously, only the largest
of these need be counted for contribution to the time-cost
and the inner sum in Eqn. (8) reduces to its largest term,
$\tau_{j,j+1} = \kappa/2$, to give

$$T_{par} = \kappa \sum_{j=0}^{n-1} 2^{-1} = \kappa n/2 \approx O(n) \qquad (9)$$
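The two time-cost estimates can be evaluated numerically. The sketch below assumes $\tau_{jk} = \kappa\, 2^{j-k}$ with $\kappa = \pi/J$ and the summation limits of Eqns. (7) and (8). The numerical values of `J` and `delta` are illustrative inputs (of the order of the chloroform coupling and pulse length quoted in the experimental section), not fitted parameters.

```python
import math

def serial_time(n, delta, kappa):
    """Double sum of Eq. (7): n Hadamards plus every controlled-phase interval."""
    return n * delta + sum(kappa * 2.0 ** (j - k)
                           for j in range(n) for k in range(j + 1, n + 1))

def serial_time_closed(n, delta, kappa):
    """Closed form of Eq. (7): n*delta + kappa*(n - 1 + 2^-n)."""
    return n * delta + kappa * (n - 1 + 2.0 ** (-n))

def parallel_time(n, kappa):
    """Eq. (9): only the longest interval of each multiqubit gate is counted."""
    return sum(max(kappa * 2.0 ** (j - k) for k in range(j + 1, n + 1))
               for j in range(n))

J = 215.0              # coupling in Hz (illustrative)
delta = 10e-6          # single-qubit Hadamard time in seconds (illustrative)
kappa = math.pi / J

for n in (2, 4, 8):
    print(n, serial_time(n, delta, kappa), parallel_time(n, kappa))
```

Both estimates grow linearly with $n$; the parallel cost $\kappa n/2$ is smaller than the serial cost for every $n \ge 2$ because the shorter coupling intervals of each multiqubit gate run concurrently with the longest one.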

The analysis does not include the degradation of each
gate due to decoherence nor does it take into account the
SWAP operations, since the latter can in most cases be
avoided by a relabeling of qubits. Implementing the
Approximate QFT [19] would require fewer controlled-phase
gates but would correspondingly reduce the accuracy.
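The saving in controlled-phase gates offered by an approximate QFT can be counted directly. The sketch below assumes the common form of the approximation, in which gates $B_{j,k}$ with qubit distance $k - j$ above a cutoff are dropped; the cutoff `max_distance` is a hypothetical parameter introduced here for illustration.

```python
import math

def phase_gate_count(n, max_distance=None):
    """Number of B_{j,k} gates (1 <= j < k <= n) with k - j <= max_distance."""
    if max_distance is None:
        max_distance = n - 1          # exact QFT: keep every gate
    return sum(1 for j in range(1, n + 1) for k in range(j + 1, n + 1)
               if k - j <= max_distance)

def smallest_retained_angle(max_distance):
    """Smallest conditional phase kept, theta = pi * 2^(-max_distance)."""
    return math.pi * 2.0 ** (-max_distance)

n = 8
full = phase_gate_count(n)
truncated = phase_gate_count(n, max_distance=3)
print(full, truncated, smallest_retained_angle(3))
```

The dropped gates are those with the smallest phase angles, which is why the reduction in gate count comes at the price of reduced accuracy.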

IV. NMR IMPLEMENTATIONS OF THE QFT 

Experiments were performed on a de-gassed, flame-sealed
sample of $^{13}$C-labeled chloroform, with $^{13}$C and $^{1}$H
as the two qubits and a coupling constant of $J_{12} \approx 215$ Hz.
Qubit-selective 90 degree pulses are of the order of 10 $\mu$s.
The unitary transformations required for the parallel
decomposition of the QFT can be implemented either
by transition-selective pulses or by J-coupling intervals
sandwiched between qubit-selective pulses. A low-power
rectangular pulse of length 6.5 ms was used to selectively
excite individual transitions for the selective implementation
of the QFT. For heteronuclear systems, RF pulses
are applied on two different channels, leading to a
reduction in the duration of selective pulses.

Each version of the QFT was implemented on a temporally
averaged pseudopure state [23], obtained from the
thermal equilibrium ensemble as the sum of three
experiments

E (do-nothing operation) 
{90.} c -^{90j c {9(U ff -^{90,} ff 

{Wx} H 7^r{Wx} C {Wy} H 7^-{W y } C 
ZJ12 Un 
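The idea behind temporal averaging can be sketched at the level of populations. The example below rests on an assumption that is not spelled out in the text: that the three experiments correspond to the identity and the two cyclic permutations of the populations of $|01\rangle$, $|10\rangle$ and $|11\rangle$. The gyromagnetic-ratio values are rounded, illustrative numbers.

```python
def cyclic_permutations(pops):
    """Identity plus the two cyclic permutations of the |01>, |10>, |11> populations."""
    p00, p01, p10, p11 = pops
    return [(p00, p01, p10, p11), (p00, p11, p01, p10), (p00, p10, p11, p01)]

def temporal_average(pops):
    """Sum of the populations over the three experiments."""
    return [sum(col) for col in zip(*cyclic_permutations(pops))]

# Traceless thermal deviation populations of a two-spin system,
# proportional to gamma_1*Iz_1 + gamma_2*Iz_2 (illustrative ratio of about 4).
gamma_1, gamma_2 = 1.0, 4.0
thermal = [(gamma_1 + gamma_2) / 2, (gamma_1 - gamma_2) / 2,
           (-gamma_1 + gamma_2) / 2, -(gamma_1 + gamma_2) / 2]

summed = temporal_average(thermal)
print(summed)
```

After summation the populations of $|01\rangle$, $|10\rangle$ and $|11\rangle$ are equal, so the summed deviation matrix is proportional to $|00\rangle\langle 00|$ plus a multiple of the identity, which is the defining property of a pseudopure state.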

The details of the pulse sequences used to implement the
serial, parallel and the selective-pulse (parallel)
decompositions of the QFT are given in Table I. The final SWAP
operation was not executed; instead the readout in the
reverse order was achieved by "relabeling" the qubits at
the end of each experiment.

The results of all three implementations of the QFT are
shown in Figure 2, using three-dimensional bar graphs to
represent components of the final density matrix. Since
only single-quantum terms are observable in NMR, it is
necessary to perform a series of experiments that rotate
unobservable terms into observable ones, in order to
sample the entire density matrix. The density matrix
after each implementation of the QFT was reconstructed
by standard quantum state tomography procedures,
using a set of nine experiments and qubit-selective
readouts [?].

FIG. 2: The experimental deviation density matrices (a) for
the pseudopure state $|00\rangle$ and for the states obtained after
applying the QFT to the pseudopure state using the pulse
sequence for (b) the selective implementation, (c) the
parallel implementation and (d) the serial implementation of the
QFT. The rows are labelled in the standard computational
basis.

The precision of the QFT implementation can be
estimated by measuring its "fidelity", defined for mixed
density matrices (such as the ones encountered in NMR) as [?]

$$F = \frac{\mathrm{Tr}(\rho_{th}\, \rho_{exp})}{\sqrt{\mathrm{Tr}(\rho_{th}^2)\, \mathrm{Tr}(\rho_{exp}^2)}} \sqrt{\frac{\mathrm{Tr}(\rho_{exp}^2)}{\mathrm{Tr}(\rho_{init}^2)}}$$

The first term in the expression measures the correlation
between the experimental deviation density matrix
$\rho_{exp}$ and the theoretical deviation density matrix $\rho_{th}$
(obtained by "applying" the unitary operator corresponding
to the ideal QFT transformation to the initial density
matrix $\rho_{init}$). The second term is the weighting factor to
take into account the overall signal loss due to
decoherence during the experiment.
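The fidelity measure can be written as a short function. This is a sketch under the assumption that the expression is the product of a normalised correlation and an attenuation factor $\sqrt{\mathrm{Tr}(\rho_{exp}^2)/\mathrm{Tr}(\rho_{init}^2)}$, as described in the text. The test matrices are synthetic (an ideal QFT output for the pseudopure state $|00\rangle$, uniformly attenuated) and are not the measured data.

```python
import math

def trace_of_product(A, B):
    """Real part of Tr(A B) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[i][k] * B[k][i] for i in range(n) for k in range(n)).real

def fidelity(rho_th, rho_exp, rho_init):
    """Correlation of rho_exp with rho_th, weighted by the overall signal loss."""
    correlation = trace_of_product(rho_th, rho_exp) / math.sqrt(
        trace_of_product(rho_th, rho_th) * trace_of_product(rho_exp, rho_exp))
    attenuation = math.sqrt(trace_of_product(rho_exp, rho_exp)
                            / trace_of_product(rho_init, rho_init))
    return correlation * attenuation

# Deviation matrix of the pseudopure state |00>: |00><00| - I/4
rho_init = [[0.75 if i == j == 0 else (-0.25 if i == j else 0.0) for j in range(4)]
            for i in range(4)]
# Ideal QFT output of |00> is the uniform superposition: all entries 1/4, minus I/4
rho_th = [[0.0 if i == j else 0.25 for j in range(4)] for i in range(4)]
# Synthetic "experimental" matrix: ideal result with 20% signal loss
rho_exp = [[0.8 * x for x in row] for row in rho_th]

print(fidelity(rho_th, rho_exp, rho_init))
```

A perfectly correlated but attenuated matrix gives a fidelity equal to the attenuation, which is how the measure penalises signal loss in addition to errors in the shape of the density matrix.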

The fidelities measured for the serial, parallel and
selective-pulse versions of the QFT are 79%, 80% and
85% respectively. The reduction in fidelity is mainly due
to imperfections in pulse calibration as well as system
decoherence. It is not surprising that the serial and parallel
versions are equally accurate for the case of two qubits.
The savings in time and the increase in accuracy of the
parallel QFT will be realised only for a larger number of
qubits. The better performance of the selective scheme is
due to the fact that such a direct implementation of the
square-root of the controlled-NOT gate does not require
refocusing schemes [26, 27]. However, in systems with
a larger number of qubits such selective pulse schemes
might not be feasible, the major stumbling blocks in such
cases being decoherence during the pulses and the
overlap of transitions in crowded spectra.

V. OTHER SPIN-BASED ARCHITECTURES 

Recently, several approaches have been suggested for
the design of solid-state spin-based quantum computers.
Kane's proposal [15], using single donor spins in Si,
addresses the problem of scalability but has the
disadvantages inherent in single-spin measurements. Ladd et al.'s
solid-state NMR quantum computing device, on the other
hand, is made entirely of silicon, with the qubits being
spin-1/2 nuclei located in isolated atomic chains [16].
Suter et al. [17] proposed an alternative architecture
with each logical qubit being represented by two physical
qubits: an active electron spin to manipulate quantum
information and a passive nuclear spin to store
information. A logical qubit is addressed using magnetic
field gradients, and SWAP gates, realised as a cascade of
three transition-selective pulses, are used to convert
between active and passive states. A basic two-qubit gate
relies on the dipolar interaction between electron spins
and requires four additional SWAP gates, two to switch
between active and passive states and two back-SWAPs
to switch off the interaction between the neighbouring
qubits. The hyperfine interaction is of the order of a
few MHz and the electron dipolar interaction strength is
around 10-50 MHz. An estimate of the actual time-cost
of the QFT for such a solid-state spin quantum computer
yields

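$$\begin{aligned}
T_{ser} &= n\delta + \sum_{j=0}^{n-1} \Big( 2\tau_{SWAP} + \sum_{k=j+1}^{n} \tau_{jk} \Big) \\
        &= n\delta + 2n\Delta + \kappa \sum_{j=0}^{n-1} \sum_{k=j+1}^{n} 2^{j-k} \\
        &= n\delta + 2n\Delta + \kappa (n - 1 + 2^{-n}) \approx O(n)
\end{aligned} \qquad (10)$$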

where $\delta$ is the time-cost of each single-qubit Hadamard
rotation, $\kappa = \pi/d$, $d$ is the strength of the dipolar
interaction and $\Delta$ is the time unit of one SWAP gate. Since
the gate times for this implementation are very fast, a
greater number of logical operations, compared to
liquid-state NMR computers, can be performed within the
system decoherence limit. These solid-state proposals are
also scalable to a very large number of qubits.
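The solid-state estimate can be evaluated in the same way as the liquid-state one. The sketch below assumes the form $n\delta + 2n\Delta + \kappa(n - 1 + 2^{-n})$ with $\kappa = \pi/d$. The dipolar strength is taken from the 10-50 MHz range quoted above, while the single-qubit and SWAP times are hypothetical placeholders, since the text gives no values for them.

```python
import math

def solid_state_serial_time(n, delta, swap, d):
    """n*delta + 2*n*swap + kappa * sum_j sum_k 2^(j-k), with kappa = pi/d."""
    kappa = math.pi / d
    double_sum = sum(2.0 ** (j - k) for j in range(n) for k in range(j + 1, n + 1))
    return n * delta + 2 * n * swap + kappa * double_sum

def solid_state_serial_time_closed(n, delta, swap, d):
    """Closed form n*delta + 2*n*swap + (pi/d)*(n - 1 + 2^-n)."""
    return n * delta + 2 * n * swap + (math.pi / d) * (n - 1 + 2.0 ** (-n))

d = 10e6          # dipolar interaction strength in Hz (lower end of the quoted range)
delta = 1e-7      # hypothetical single-qubit gate time in seconds
swap = 1e-6       # hypothetical SWAP gate time in seconds

for n in (2, 4, 8):
    print(n, solid_state_serial_time(n, delta, swap, d))
```

With couplings in the MHz range the interaction term $\kappa(n - 1 + 2^{-n})$ is several orders of magnitude shorter than for a scalar coupling of a few hundred Hz, so under these assumptions the SWAP overhead becomes a significant part of the total.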

In conclusion, we have estimated the realistic time-costs
of different decompositions of the QFT for liquid
and solid-state NMR quantum computers and have
measured the accuracy of the implementations experimentally
using liquid-state NMR. While all quantum
computation can be implemented using the two-qubit
universal controlled-NOT gate and one-qubit rotations, the
number of these basic operations increases exponentially
with the number of qubits. It has been suggested that
for some specific QC purposes, using more complicated
multiqubit gates might be computationally more
efficient [28, 29]. The parallel implementation of the QFT
suggested by Cory et al., using multiqubit gates,
performs better than the serial implementation. The
actual experimental time-costs can be improved upon
using innovative techniques like multiqubit gates, creative
refocusing schemes [30] and time-optimal gates designed
using control theory [31, 32].
TABLE I: NMR pulse schemes for different implementations of the QFT on $n = 2$ and $n = 3$ qubits. Superscript $r \to s$ indicates
a selective RF pulse on the transition $r \to s$. Superscripts $r$ and $r,s$ indicate spin-selective pulses on the spins $r$ and $r,s$
respectively. Subscript z indicates a composite-z pulse which can be expanded as a sandwich of rf pulses {8 Z } = {90, }{#,,} {90,}. 


Implementation 


n = 2 


n — 3 


Serial QFT 


{90,} 1 {180 :c } 1 dll{90,} 1 ' 2 {454 1 ' 2 

{go-j 1 - 2 ^} 2 ^,} 2 


{45 a } 3 { 180, } 3 {45_ H } 3 { 180, } 3 ^ 
{180-4 3 {90 y } 2 · 3 {454 2 ' 3 {90- !/ } 2 · 3 {45 !/ } 2 

{ÍSO,} 2 ^} 2 ^,} 1 ^^-,} 1 
{90J 1 · 3 {22.54 1 · 3 {90- H } 1 · 3 {1804 2 1 ^ 
{180 ;c } 2 {90 H } 1 · 2 {45 :c } 1 ' 2 {90_ !/ } 1 · 2 {45 a } 1 
{1804H45-J 1 


Parallel QFT 




{45 !/ } 1 · 2 · 3 {1804 1 ' 2 ' 3 {45- H } 1 ' 2 ' 3 {90_,} 2 i 1 ;7 
{90 y } 2 {454 2 {1804 1 {90 y } 1 ' 2 {454 1 · 2 

{67.5,} 1 {135_ z } 3 {90_ z } 2 


Selective-pulse 
(parallel QFT) 


{90 !/ } 1 ' 2 {1804 1 ' 2 {90 :c } 3 ^ 4 {45 z } 1 


{90 3/ } 1 · 2 · 3 {1804 1 · 2 · 3 {904 6 ^ 8 {904 5 ^ 7 
{180, } 7 ^ 8 {90, } 5 ^ 6 {90, } 3 ^ 4 {67.5 Z Y {45 z } 2 



